
1. Drill goals and metrics
(1) Goal: ensure that the "high-defense immortal server" maintains business availability of ≥ 99.5% under a DDoS attack of ≥ 100 Gbps.
(2) Key indicators: detection time (target ≤ 30 s), automatic scrubbing switchover time (target ≤ 60 s), and MTTR (target ≤ 15 minutes).
(3) Coverage: VPS/hosts, CDN origin fetch, BGP multi-line, DNS resolution policy, and firewall/ACL rules.
(4) Resource quantification: the drill draws on scrubbing bandwidth, backup hosts, and DNS switchover records, and must record bandwidth and latency data.
(5) Compliance and security: run the drill scripts only on a test network, or with whitelisting coordinated with the ISP/cloud vendor, to avoid accidentally damaging public network services.
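
The targets in this section can be written down as a small pass/fail check so that every drill is scored the same way. The following is a minimal sketch, not a prescribed implementation: the names `TARGETS`, `DrillResult`, and `evaluate` are hypothetical, and only the numeric targets come from the goals above.

```python
from dataclasses import dataclass
from typing import Dict

# Numeric targets come from the drill goals; the key names are illustrative.
TARGETS = {
    "detection_s": 30.0,        # detection time, seconds (upper bound)
    "scrub_switch_s": 60.0,     # automatic scrubbing (cleaning) switchover, seconds (upper bound)
    "mttr_min": 15.0,           # MTTR, minutes (upper bound)
    "availability_pct": 99.5,   # business availability, percent (lower bound)
}


@dataclass
class DrillResult:
    detection_s: float
    scrub_switch_s: float
    mttr_min: float
    availability_pct: float


def evaluate(result: DrillResult) -> Dict[str, bool]:
    """Return True/False per indicator, comparing a drill result with TARGETS."""
    return {
        "detection_s": result.detection_s <= TARGETS["detection_s"],
        "scrub_switch_s": result.scrub_switch_s <= TARGETS["scrub_switch_s"],
        "mttr_min": result.mttr_min <= TARGETS["mttr_min"],
        "availability_pct": result.availability_pct >= TARGETS["availability_pct"],
    }
```

A drill passes only when every value in the returned dictionary is `True`, for example `all(evaluate(result).values())`.
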
2. Drill scenario design and procedure
(1) Scenario A: simulate mixed SYN + UDP traffic peaking at 120 Gbps for 10 minutes to observe the effect of the scrubbing strategy.
(2) Scenario B: combine DNS amplification with an application-layer HTTP flood to test the CDN, origin protection, and caching strategies.
(3) Scenario C: link disconnection (ISP failure) to test BGP failover and multi-line active/standby capability.
(4) Step breakdown: traffic injection → detection alarm → trigger automatic scrubbing → DNS/traffic switchover → origin-fetch verification → recovery and rollback.
(5) Checklist: the drill script covers firewall rule push commands, nginx rate-limit rules, iptables blacklist import and cleanup, and monitoring alarm thresholds.
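
The step sequence in item (4) can be driven by a small runner that executes each stage in order, times it, and stops at the first failure. This is only a sketch under stated assumptions: the step names and the `run_drill` function are hypothetical, and the real actions (traffic injection, scrubbing trigger, and so on) are supplied by the caller as plain callables.

```python
import time
from typing import Callable, Dict, List, Tuple

# Step order follows the drill procedure; the identifiers are illustrative.
STEPS = [
    "inject_traffic",
    "detect_alarm",
    "trigger_scrubbing",
    "switch_dns_traffic",
    "verify_origin_fetch",
    "recover_and_rollback",
]


def run_drill(actions: Dict[str, Callable[[], bool]]) -> List[Tuple[str, bool, float]]:
    """Run each step in order and return (step, succeeded, elapsed_seconds).

    Execution stops after the first step that returns False, so the
    timeline shows exactly where the drill broke down.
    """
    timeline = []
    for name in STEPS:
        start = time.monotonic()
        ok = actions[name]()
        timeline.append((name, ok, time.monotonic() - start))
        if not ok:
            break
    return timeline
```

The per-step timings in the returned timeline map directly onto the detection, switchover, and recovery indicators measured during the drill.
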
3. Putting monitoring and alerting into practice
(1) Monitoring items: traffic (Gbps), connection count, CPU/RAM, response time, packet loss rate, and scrubbing instance hit rate.
(2) Thresholds: traffic > 5 Gbps triggers a primary alarm; > 30 Gbps triggers a secondary alarm and automatically requests scrubbing; > 80 Gbps triggers the SLA response for all staff.
(3) Alarm chain: SMS + email + rotating phone calls + automatic ticket creation, so that on-call operations staff respond within 5 minutes.
(4) Logs and traceability: save PCAP samples, NetFlow summaries, WAF logs, and scrubbing vendor feedback for later tracing.
(5) Drill review: after each drill, tally detection time and MTTR, turn them into KPIs, and feed them into the next improvement plan.
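
The tiered thresholds in item (2) amount to a simple classification of the measured traffic rate. A minimal sketch follows; the function name and the level labels are hypothetical, and reading the > 80 Gbps tier as an all-staff response is an interpretation of the text above.

```python
def alarm_level(traffic_gbps: float) -> str:
    """Map inbound traffic (Gbps) to an alarm tier using strict '>' thresholds."""
    if traffic_gbps > 80:
        return "all_staff_sla"          # highest tier: SLA response for all staff
    if traffic_gbps > 30:
        return "secondary_auto_scrub"   # secondary alarm plus automatic scrubbing request
    if traffic_gbps > 5:
        return "primary"                # primary alarm only
    return "normal"
```

Checking the highest threshold first keeps each tier exclusive, so a 120 Gbps sample is reported once at the top tier rather than matching all three.
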
4. Real-world case and configuration examples
(1) Case summary: a Hong Kong game company faced a DDoS attack peaking at 128 Gbps in 2024. After adopting high-defense VPS + CDN + BGP multi-line, the business was affected for only 3 minutes and origin fetch was quickly restored.
(2) Host configuration example: HK high-defense VPS A: 8 vCPU / 32 GB RAM / 1 TB NVMe / 1 Gbps bandwidth (can withstand up to 200 Gbps with scrubbing; scrubbing bandwidth is provided by the ISP).
(3) Domain name and DNS: the primary domain's A record is preset with a low TTL of 60 s, and the disaster-recovery CNAME points to the CDN scrubbing domain; the time taken by the backup DNS switchover is verified during the drill.
(4) WAF rules: preset rate limiting, dropping of abnormal User-Agents, API signature verification, and automated IP blacklists/whitelists.
(5) Drill data table (sample results): the following table compares key indicators before and after the drill.
| Item | Before drill | After drill (optimized) |
|---|---|---|
| Detection time | 45 s | 18 s |
| Automatic scrubbing switchover | 120 s | 40 s |
| MTTR | 28 minutes | 9 minutes |
| Business availability | 98.3% | 99.86% |
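
To make the table easier to compare across drills, the before/after values can be reduced to relative improvements. The sketch below uses only the sample figures from the table; the names `SAMPLE`, `reduction_pct`, and `unavailability_pct` are hypothetical.

```python
# (before, after) pairs copied from the sample table.
SAMPLE = {
    "detection_s": (45, 18),
    "scrub_switch_s": (120, 40),
    "mttr_min": (28, 9),
}
AVAILABILITY_PCT = (98.3, 99.86)


def reduction_pct(before: float, after: float) -> float:
    """Relative reduction of a time-based indicator, in percent."""
    return round((before - after) / before * 100, 1)


def unavailability_pct(availability: float) -> float:
    """Share of time the business was unavailable, in percent."""
    return round(100 - availability, 2)


improvements = {name: reduction_pct(b, a) for name, (b, a) in SAMPLE.items()}
downtime_share = tuple(unavailability_pct(v) for v in AVAILABILITY_PCT)
```

For the sample data this gives reductions of 60.0% in detection time, 66.7% in switchover time, and 67.9% in MTTR, while the unavailable share of time falls from 1.7% to 0.14%.
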
5. Automation and scripted operation checklist
(1) Automation tasks: use Ansible/SaltStack for firewall rule distribution, nginx configuration switching, log collection, and recovery scripts.
(2) Traffic drill tools: use internal traffic replay or a controlled load generator, in cooperation with a third-party vendor, to record PCAPs and replay them against the target IP.
(3) Automatic DNS switchover: call the cloud DNS or registrar API to replace A/CNAME records with TTL = 60 s and verify that the change takes effect.
(4) BGP switchover: work with the ISP to preset backup routes and BGP community values, push routing policies, and verify RPKI validation and route convergence during drills.
(5) Rollback strategy: every drill step must specify a rollback command, an owner, and a rollback window, so that a misoperation does not cause wider impact.
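
Verifying that a DNS switchover "takes effect" (item 3) usually means polling a resolver until the new record value appears and recording how long that took. The sketch below assumes the caller supplies a `resolve` callable (for example, a wrapper around whatever resolver library or DNS provider API is in use); `wait_for_switch` and the timeout of twice the TTL are illustrative choices, not part of any provider's API.

```python
import time
from typing import Callable, Iterable, Optional


def wait_for_switch(
    resolve: Callable[[str], Iterable[str]],
    name: str,
    expected: str,
    ttl_s: int = 60,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Poll until `expected` is among the resolved values for `name`.

    Returns the elapsed seconds on success, or None if the record has not
    changed after 2 * ttl_s seconds (an assumed, adjustable timeout).
    """
    deadline_s = 2 * ttl_s
    start = clock()
    while True:
        if expected in set(resolve(name)):
            return clock() - start
        if clock() - start >= deadline_s:
            return None
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the function testable without real waiting; in a drill, the returned elapsed time can be logged as the DNS switchover duration, and a `None` result can be treated as a signal to start the rollback.
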
6. Post-drill review and continuous improvement
(1) Review process: record the event timeline, owners, decision points, and time spent, and complete a review report within 48 hours.
(2) Data-driven improvement: adjust detection thresholds based on the drill data table, shorten automation script execution time, and optimize the monitoring dashboard.
(3) Training and drill frequency: run a full-link live drill at least once per quarter and a tabletop drill every month.
(4) Supplier coordination: sign SLAs with scrubbing services, CDNs, and ISPs, and hold regular joint drills to verify cross-vendor switchover capability.
(5) Documentation and standardization: keep proven strategies, scripts, and blacklist libraries under version control so that any on-duty engineer can complete the operations by following the SOP.